 Djelfa


Dhati+: Fine-tuned Large Language Models for Arabic Subjectivity Evaluation

Bellaouar, Slimane, Nehar, Attia, Souffi, Soumia, Bouameur, Mounia

arXiv.org Artificial Intelligence

Despite its significance, Arabic, a linguistically rich and morphologically complex language, faces the challenge of being under-resourced. The scarcity of large annotated datasets hampers the development of accurate tools for subjectivity analysis in Arabic. Recent advances in deep learning and Transformers have proven highly effective for text classification in English and French. This paper proposes a new approach for subjectivity assessment in Arabic textual data. To address the dearth of specialized annotated datasets, we developed a comprehensive dataset, AraDhati+, by leveraging existing Arabic datasets and collections (ASTD, LABR, HARD, and SANAD). Subsequently, we fine-tuned state-of-the-art Arabic language models (XLM-RoBERTa, AraBERT, and ArabianGPT) on AraDhati+ for effective subjectivity classification. Furthermore, we experimented with an ensemble decision approach to harness the strengths of individual models. Our approach achieves a remarkable accuracy of 97.79% for Arabic subjectivity classification. Results demonstrate the effectiveness of the proposed approach in addressing the challenges posed by limited resources in Arabic language processing.
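
The abstract does not say how the ensemble decision is made. As one plausible reading, the sketch below combines the labels predicted by several fine-tuned models with a simple majority vote. The function name `majority_vote`, the label strings, and the tie-breaking rule are all assumptions for illustration, not the authors' method.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the labels predicted by several models for one text.

    `predictions` is a list with one label per model (hypothetical layout).
    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Example call: three hypothetical models vote on one sentence.
label = majority_vote(["subjective", "objective", "subjective"])
```

Majority voting is only one option; a weighted vote over class probabilities would fit the same description.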


Arabic Multimodal Machine Learning: Datasets, Applications, Approaches, and Challenges

Haouhat, Abdelhamid, Bellaouar, Slimane, Nehar, Attia, Cherroun, Hadda, Abdelali, Ahmed

arXiv.org Artificial Intelligence

Multimodal Machine Learning (MML) aims to integrate and analyze information from diverse modalities, such as text, audio, and visuals, enabling machines to address complex tasks like sentiment analysis, emotion recognition, and multimedia retrieval. Recently, Arabic MML has reached a certain level of maturity in its foundational development, making it time to conduct a comprehensive survey. This paper explores Arabic MML by categorizing efforts through a novel taxonomy and analyzing existing research. Our taxonomy organizes these efforts into four key topics: datasets, applications, approaches, and challenges. By providing a structured overview, this survey offers insights into the current state of Arabic MML, highlighting areas that have not been investigated and critical research gaps. Researchers will be empowered to build upon the identified opportunities and address challenges to advance the field.


AGI Enabled Solutions For IoX Layers Bottlenecks In Cyber-Physical-Social-Thinking Space

Khelloufi, Amar, Ning, Huansheng, Dhelim, Sahraoui, Ding, Jianguo

arXiv.org Artificial Intelligence

The integration of the Internet of Everything (IoX) and Artificial General Intelligence (AGI) has given rise to a transformative paradigm aimed at addressing critical bottlenecks across sensing, network, and application layers in Cyber-Physical-Social Thinking (CPST) ecosystems. In this survey, we provide a systematic and comprehensive review of AGI-enhanced IoX research, focusing on three key components: sensing-layer data management, network-layer protocol optimization, and application-layer decision-making frameworks. Specifically, this survey explores how AGI can mitigate IoX bottlenecks by leveraging adaptive sensor fusion, edge preprocessing, and selective attention mechanisms at the sensing layer, while resolving network-layer issues such as protocol heterogeneity and dynamic spectrum management, neuro-symbolic reasoning, active inference, and causal reasoning. Furthermore, the survey examines AGI-enabled frameworks for managing identity and relationship explosion. Key findings suggest that AGI-driven strategies, such as adaptive sensor fusion, edge preprocessing, and semantic modeling, offer novel solutions to sensing-layer data overload, network-layer protocol heterogeneity, and application-layer identity explosion. The survey underscores the importance of cross-layer integration, quantum-enabled communication, and ethical governance frameworks for future AGI-enabled IoX systems. Finally, the survey identifies unresolved challenges, such as computational requirements, scalability, and real-world validation, calling for further research to fully realize AGI's potential in addressing IoX bottlenecks. We believe AGI-enhanced IoX is emerging as a critical research field at the intersection of interconnected systems and advanced AI.


Efficient k-NN Search in IoT Data: Overlap Optimization in Tree-Based Indexing Structures

Benrazek, Ala-Eddine, Kouahla, Zineddine, Farou, Brahim, Seridi, Hamid, Kemouguette, Ibtissem

arXiv.org Artificial Intelligence

The proliferation of interconnected devices in the Internet of Things (IoT) has led to an exponential increase in data, commonly known as Big IoT Data. Efficient retrieval of this heterogeneous data demands a robust indexing mechanism for effective organization. However, a significant challenge remains: the overlap in data space partitions during index construction. This overlap increases node access during search and retrieval, resulting in higher resource consumption and performance bottlenecks, and impeding system scalability. To address this issue, we propose three innovative heuristics designed to quantify and strategically reduce data space partition overlap. The volume-based method (VBM) offers a detailed assessment by calculating the intersection volume between partitions, providing deeper insights into spatial relationships. The distance-based method (DBM) enhances efficiency by using the distance between partition centers and radii to evaluate overlap, offering a streamlined yet accurate approach. Finally, the object-based method (OBM) provides a practical solution by counting objects across multiple partitions, delivering an intuitive understanding of data space dynamics. Experimental results demonstrate the effectiveness of these methods in reducing search time, underscoring their potential to improve data space partitioning and enhance overall system performance.
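
To make the distance-based and object-based ideas in this abstract concrete, the sketch below assumes that each partition is a ball described by a center and a radius. The function names `dbm_overlap` and `obm_overlap` and the exact scoring formulas are assumptions for illustration; the abstract does not give the paper's definitions.

```python
import math

def dbm_overlap(center_a, radius_a, center_b, radius_b):
    """Distance-based score: how deeply two balls interpenetrate.

    Returns radius_a + radius_b - distance(centers), clipped at 0 so that
    disjoint partitions score 0 (assumed scoring rule).
    """
    d = math.dist(center_a, center_b)
    return max(0.0, radius_a + radius_b - d)

def obm_overlap(objects, partitions):
    """Object-based score: number of objects that fall inside more than one ball.

    `partitions` is a list of (center, radius) pairs (hypothetical layout).
    """
    shared = 0
    for obj in objects:
        hits = sum(1 for center, radius in partitions
                   if math.dist(obj, center) <= radius)
        if hits > 1:
            shared += 1
    return shared
```

The distance-based score needs only the two centers and radii, whereas the object-based score requires a pass over the stored objects, which is consistent with the abstract's description of DBM as the more streamlined of the two.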


Riemannian Geometry-Based EEG Approaches: A Literature Review

Tibermacine, Imad Eddine, Russo, Samuele, Tibermacine, Ahmed, Rabehi, Abdelaziz, Nail, Bachir, Kadri, Kamel, Napoli, Christian

arXiv.org Artificial Intelligence

The application of Riemannian geometry in the decoding of brain-computer interfaces (BCIs) has swiftly garnered attention because of its straightforwardness, precision, and resilience, along with its aptitude for transfer learning, which has been demonstrated through significant achievements in global BCI competitions. This paper presents a comprehensive review of recent advancements in the integration of deep learning with Riemannian geometry to enhance EEG signal decoding in BCIs. Our review updates the findings since the last major review in 2017, comparing modern approaches that utilize deep learning to improve the handling of non-Euclidean data structures inherent in EEG signals. We discuss how these approaches not only tackle the traditional challenges of noise sensitivity, non-stationarity, and lengthy calibration times but also introduce novel classification frameworks and signal processing techniques to reduce these limitations significantly. Furthermore, we identify current shortcomings and propose future research directions in manifold learning and Riemannian-based classification, focusing on practical implementations and theoretical expansions, such as feature tracking on manifolds, multitask learning, feature extraction, and transfer learning. This review aims to bridge the gap between theoretical research and practical, real-world applications, making sophisticated mathematical approaches accessible and actionable for BCI enhancements.


Adapting Mental Health Prediction Tasks for Cross-lingual Learning via Meta-Training and In-context Learning with Large Language Model

Lifelo, Zita, Ning, Huansheng, Dhelim, Sahraoui

arXiv.org Artificial Intelligence

Timely identification is essential for the efficient handling of mental health illnesses such as depression. However, the current research fails to adequately address the prediction of mental health conditions from social media data in low-resource African languages like Swahili. This study introduces two distinct approaches utilising model-agnostic meta-learning and leveraging large language models (LLMs) to address this gap. Experiments are conducted on three datasets translated into a low-resource language and applied to four mental health tasks, which include stress, depression, depression severity and suicidal ideation prediction. We first apply a meta-learning model with self-supervision, which results in improved model initialisation for rapid adaptation and cross-lingual transfer. The results show that our meta-trained model performs significantly better than standard fine-tuning methods, outperforming the baseline fine-tuning in macro F1 score by 18% and 0.8% over XLM-R and mBERT, respectively. In parallel, we use LLMs' in-context learning capabilities to assess their performance accuracy across the Swahili mental health prediction tasks by analysing different cross-lingual prompting approaches. Our analysis showed that Swahili prompts performed better than cross-lingual prompts but worse than English prompts. Our findings show that in-context learning can be achieved through cross-lingual transfer through carefully crafted prompt templates with examples and instructions.


Fuzzy hyperparameters update in a second order optimization

Bensadok, Abdelaziz, Babar, Muhammad Zeeshan

arXiv.org Artificial Intelligence

This research will present a hybrid approach to accelerate convergence in a second order optimization. An online finite difference approximation of the diagonal Hessian matrix will be introduced, along with fuzzy inferencing of several hyperparameters. Competitive results have been achieved.
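
The abstract mentions a finite difference approximation of the diagonal Hessian without giving its form. The sketch below shows the standard central second difference along each coordinate, which is one common way to obtain such a diagonal. The function name `diag_hessian_fd` and the step size `h` are assumptions, and the sketch covers neither the online aspect nor the fuzzy hyperparameter inference described above.

```python
def diag_hessian_fd(f, x, h=1e-3):
    """Approximate the diagonal of the Hessian of f at point x.

    Uses the central second difference per coordinate i:
        H_ii ~ (f(x + h*e_i) - 2*f(x) + f(x - h*e_i)) / h**2
    `f` takes a list of floats and returns a float; `h` is an assumed step size.
    """
    fx = f(x)
    diag = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        diag.append((f(x_plus) - 2.0 * fx + f(x_minus)) / (h * h))
    return diag
```

This version costs two extra function evaluations per parameter, so a practical optimizer would likely estimate the diagonal more cheaply, for example from gradient differences between successive steps.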


Selective Task offloading for Maximum Inference Accuracy and Energy efficient Real-Time IoT Sensing Systems

Sada, Abdelkarim Ben, Khelloufi, Amar, Naouri, Abdenacer, Ning, Huansheng, Dhelim, Sahraoui

arXiv.org Artificial Intelligence

The recent advancements in small-size inference models have facilitated AI deployment on the edge. However, the limited resource nature of edge devices poses new challenges, especially for real-time applications. Deploying multiple inference models (or a single tunable model) varying in size and therefore accuracy and power consumption, in addition to an edge server inference model, can offer a dynamic system in which the allocation of inference models to inference jobs is performed according to the current resource conditions. Therefore, in this work, we tackle the problem of selectively allocating inference models to jobs or offloading them to the edge server to maximize inference accuracy under time and energy constraints. This problem is shown to be an instance of the unbounded multidimensional knapsack problem, which is considered strongly NP-hard. We propose a lightweight hybrid genetic algorithm (LGSTO) to solve this problem. We introduce a termination condition and neighborhood exploration techniques for faster evolution of populations. We compare LGSTO with the naive and dynamic programming solutions, with classic genetic algorithms using different reproduction methods, including NSGA-II, and finally with other evolutionary methods such as particle swarm optimization (PSO) and ant colony optimization (ACO). Experimental results show that LGSTO performed 3 times faster than the fastest comparable schemes while producing schedules with higher average accuracy.
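
To illustrate the allocation problem described in this abstract, the sketch below enumerates every assignment of one inference option per job and keeps the feasible assignment with the highest average accuracy. It is a simplified exhaustive baseline under assumed inputs, not LGSTO and not necessarily the paper's naive solution. The tuple layout `(accuracy, time, energy)`, the function name `naive_allocation`, and the single shared time and energy budgets are all assumptions.

```python
from itertools import product

def naive_allocation(options, n_jobs, time_budget, energy_budget):
    """Exhaustively choose one option per job to maximize average accuracy.

    `options` is a list of (accuracy, time, energy) tuples, one per available
    inference model (an edge-server offload can be listed as one more option).
    Returns (best_accuracy, plan), where plan[j] is the option index for job j,
    or (-1.0, None) if no assignment fits both budgets.
    """
    best_acc, best_plan = -1.0, None
    for plan in product(range(len(options)), repeat=n_jobs):
        total_time = sum(options[i][1] for i in plan)
        total_energy = sum(options[i][2] for i in plan)
        if total_time <= time_budget and total_energy <= energy_budget:
            acc = sum(options[i][0] for i in plan) / n_jobs
            if acc > best_acc:
                best_acc, best_plan = acc, plan
    return best_acc, best_plan
```

The search space grows as the number of options raised to the number of jobs, which is why exhaustive enumeration is only usable for very small instances and why heuristic methods are attractive for real-time use.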


IP-UNet: Intensity Projection UNet Architecture for 3D Medical Volume Segmentation

Aung, Nyothiri, Kechadi, Tahar, Chen, Liming, Dhelim, Sahraoui

arXiv.org Artificial Intelligence

CNNs have been widely applied for medical image analysis. However, limited memory capacity is one of the most common drawbacks of processing high-resolution 3D volumetric data. 3D volumes are usually cropped or downsized before processing, which can result in a loss of resolution, increase class imbalance, and affect the performance of the segmentation algorithms. In this paper, we propose an end-to-end deep learning approach called IP-UNet. IP-UNet is a UNet-based model that performs multi-class segmentation on Intensity Projection (IP) of 3D volumetric data instead of the memory-consuming 3D volumes. IP-UNet uses limited memory capability for training without losing the original 3D image resolution. We compare the performance of three models in terms of segmentation accuracy and computational cost: 1) Slice-by-slice 2D segmentation of the CT scan images using a conventional 2D UNet model. 2) IP-UNet, which operates on data obtained by merging the extracted Maximum Intensity Projection (MIP), Closest Vessel Projection (CVP), and Average Intensity Projection (AvgIP) representations of the source 3D volumes, then applying the UNet model on the output IP images. 3) A 3D-UNet model that directly reads the 3D volumes constructed from a series of CT scan images and outputs the 3D volume of the predicted segmentation. We test the performance of these methods on 3D volumetric images for automatic breast calcification detection. Experimental results show that IP-UNet can achieve segmentation accuracy similar to 3D-UNet but with much better performance. It reduces the training time by 70% and memory consumption by 92%.
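
As a small illustration of the intensity projections named in this abstract, the sketch below computes a Maximum Intensity Projection and an Average Intensity Projection of a 3D volume stored as nested lists. The indexing order `volume[z][y][x]`, the choice of the z axis as projection direction, and the function name `intensity_projections` are assumptions. Closest Vessel Projection is left out because the abstract does not define it.

```python
def intensity_projections(volume):
    """Return (mip, avg_ip) for a volume indexed as volume[z][y][x].

    Both outputs are 2D lists indexed [y][x]: each pixel holds the maximum
    (MIP) or the mean (AvgIP) of the voxels along the z axis at that position.
    """
    depth = len(volume)
    rows = len(volume[0])
    cols = len(volume[0][0])
    mip = [[max(volume[z][y][x] for z in range(depth)) for x in range(cols)]
           for y in range(rows)]
    avg_ip = [[sum(volume[z][y][x] for z in range(depth)) / depth
               for x in range(cols)]
              for y in range(rows)]
    return mip, avg_ip
```

Each projection collapses the depth axis, so a volume of shape depth × rows × cols becomes a rows × cols image, which is the source of the memory savings the abstract reports for working on projections instead of full volumes.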


Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

Naggita, Keziah, LaChance, Julienne, Xiang, Alice

arXiv.org Artificial Intelligence

Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.